Automatic Evaluation of Quality of an Explanatory Dictionary by Comparison of Word Senses
Authors
Abstract
Words in an explanatory dictionary have different meanings (senses), each described by a natural language definition. If the definitions of two senses of the same word are too similar, the difference between them is hard to grasp, and it is hard to judge which of the two senses is intended in a particular context, especially when the decision must be made automatically, as in automatic word sense disambiguation. We suggest a method for formally evaluating this aspect of the quality of an explanatory dictionary by calculating the similarity of different senses of the same word. We calculate the similarity between two given senses as the relative number of equal or synonymous words in their definitions. In addition to a general assessment of the dictionary, individual suspicious definitions are reported for possible improvement. In our experiments we used the Anaya explanatory dictionary of Spanish. The experiments show that about 10% of the definitions in this dictionary are substantially similar, which indicates rather low quality.
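The similarity measure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the "relative number" is normalized, how definitions are tokenized, or what threshold marks a pair as suspicious, so everything here is an assumption: the function names, the normalization by the larger definition, the 0.5 threshold, the hand-built synonym table, and the English example definitions (which are invented, not taken from the Anaya dictionary).

```python
from itertools import combinations


def tokenize(definition):
    """Lowercase, split on whitespace, strip surrounding punctuation (assumed scheme)."""
    words = (w.strip(".,;:()\"'").lower() for w in definition.split())
    return [w for w in words if w]


def sense_similarity(def_a, def_b, synonyms=None):
    """Share of words in def_a that are equal or synonymous to a word in def_b.

    Normalizing by the larger of the two word sets is an assumption;
    the abstract only says "relative number".
    """
    synonyms = synonyms or {}
    words_a = set(tokenize(def_a))
    words_b = set(tokenize(def_b))
    if not words_a or not words_b:
        return 0.0

    def has_match(word):
        if word in words_b:
            return True
        # Check the hypothetical synonym table in both directions.
        if any(s in words_b for s in synonyms.get(word, ())):
            return True
        return any(word in synonyms.get(b, ()) for b in words_b)

    shared = sum(1 for w in words_a if has_match(w))
    return shared / max(len(words_a), len(words_b))


def suspicious_pairs(senses, threshold=0.5, synonyms=None):
    """Return (i, j, score) for every pair of senses at or above the threshold."""
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(senses), 2):
        score = sense_similarity(a, b, synonyms)
        if score >= threshold:
            flagged.append((i, j, score))
    return flagged


# Invented definitions for three senses of one hypothetical headword.
senses = [
    "a large body of water",
    "a big body of water",
    "the trunk of a tree",
]
synonyms = {"large": {"big"}}
flagged = suspicious_pairs(senses, threshold=0.5, synonyms=synonyms)
```

In this toy example only the first two definitions are flagged, since they differ by a single synonymous word. A real evaluation would presumably need lemmatization and stop-word handling for Spanish, which this sketch omits.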
Similar resources
Automatically Linking GermaNet to Wikipedia for Harvesting Corpus Examples for GermaNet Senses
The comprehension of a word sense is much easier when its usages are illustrated by example sentences in linguistic contexts. Hence, examples are crucially important to better understand the sense of a word in a dictionary. The goal of this research is the semi-automatic enrichment of senses from the German wordnet GermaNet with corpus examples from the online encyclopedia Wikipedia. The paper ...
Invited Talk: The Relevance of a Cognitive Model of the Mental Lexicon to Automatic Word Sense Disambiguation
Supervised word sense disambiguation requires training corpora that have been tagged with word senses, and these word senses typically come from a pre-existing sense inventory. Space limitations imposed by dictionary publishers have biased the field towards lists of discrete senses for an individual lexeme. Although some dictionaries use hierarchical entries to emphasize relations between sense...
Subcategorization Acquisition as an Evaluation Method for WSD
Evaluation of word sense disambiguation (WSD) systems is often based on machine-readable dictionaries (MRDs). Such evaluation typically employs a set of fine-grained dictionary senses and considers them all to be equally important. In this paper, we propose a novel evaluation method for WSD systems in the context of automatic subcategorization acquisition. Building on an extant subcategorizatio...
Constructing Word-Sense Association Networks from Bilingual Dictionary and Comparable Corpora
A novel thesaurus named a word-sense association network is proposed for the first time. It consists of nodes representing word senses, each of which is defined as a set consisting of a word and its translation equivalents, and edges connecting topically associated word senses. This word-sense association network is produced from a bilingual dictionary and comparable corpora by means of a new...
Unsupervised and Minimally Supervised Learning of Lexical Semantics Proceedings of the Workshop
Supervised word sense disambiguation requires training corpora that have been tagged with word senses, and these word senses typically come from a pre-existing sense inventory. Space limitations imposed by dictionary publishers have biased the field towards lists of discrete senses for an individual lexeme. This approach does not capture information about relatedness of individual senses. How i...
Journal title:
Volume, issue:
Pages: -
Publication date: 2003